CPSC 545/445 (Autumn 2003) - Class 17:  Gene Expression Analysis (1)
Module 6, Part 1

----
6.0 Introduction

Motivation:

measure / analyse gene expression and regulatory relationships in order to 
- investigate/understand gene function
- investigate/understand interaction between genes
- diagnose diseases
- treat / monitor diseases
- engineer organisms


recall:

Most genes are expressed in two steps: DNA -> mRNA -> proteins
Regulation mechanisms in the cell control which genes are 
expressed under given conditions.

Idea: Measure mRNA levels to provide a detailed molecular 
  view of gene expression.

Potential problem:
  mRNA levels are not always well correlated with protein levels
	(posttranscriptional regulation, e.g., mRNA editing/degradation);
  protein functionality may be affected by 
	posttranslational modifications (e.g., phosphorylation)

How is it done?
- Extract mRNA samples (target) from cell.
  note: many (thousands of) genes may be co-expressed
	-> mixture of many different mRNAs
- detect/characterise mRNAs in sample

Ideal solution:
- sequence all mRNA transcripts in sample (quantitatively)
  -> currently unrealistic, mainly due to number of molecules
	and limitations of sequencing technology
  -> need massively parallel approach for detecting/characterising 
	transcripts

Realistic solutions:
1) use specific probes (usually short DNA strands) to detect various mRNAs
	(by specific hybridisation)
	-> microarray techniques

2) extract & sequence short sequence tags (quantitatively)
	-> SAGE
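
Sketch of the quantitative step in 2): once the tags have been sequenced,
the relative abundance of a transcript can be estimated as the fraction of
all tags that match it. This is only a minimal illustration; the function
name tag_frequencies and the example tags are made up, and real SAGE
analysis also has to map tags to genes and handle sequencing errors.

```python
from collections import Counter

def tag_frequencies(tags):
    """Return {tag: fraction of all sequenced tags} for a list of tag strings."""
    counts = Counter(tags)
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()}

# Hypothetical tags from one sequencing run (made-up data, not from the notes).
tags = ["AAACCCGGGT", "AAACCCGGGT", "TTTGGGCCCA", "AAACCCGGGT"]
freqs = tag_frequencies(tags)
print(freqs)
```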

Also: new techniques for detecting proteins directly
	(protein arrays; based on various capturing techniques,
	such as antigen-antibody, specific protein/protein, 
	aptamer/protein, receptor/ligand, enzyme/substrate interactions)
	(not covered here)


---
6.1 Technologies for gene expression analysis 

key idea underlying DNA microarray technology:

use DNA complementary to (parts of) mRNA (probes)
to capture mRNAs (target) through specific hybridisation (base pairing)

(could use RNA, but DNA is easier to handle experimentally than RNA and more stable)
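
Sketch of what "complementary" means for a probe: a DNA strand that
base-pairs with an mRNA is its reverse complement (A-U and G-C pairing,
with T in place of U on the DNA side). The function name dna_probe_for
and the example sequence are illustrative assumptions; real probe design
also considers length, melting temperature and cross-hybridisation.

```python
# RNA base -> DNA base that pairs with it
PAIR = {"A": "T", "U": "A", "G": "C", "C": "G"}

def dna_probe_for(mrna):
    """DNA strand (written 5'->3') that base-pairs with the mRNA (written 5'->3')."""
    # Strands pair antiparallel, so walk the mRNA from its 3' end.
    return "".join(PAIR[base] for base in reversed(mrna.upper()))

probe = dna_probe_for("AUGGCC")
print(probe)  # GGCCAT
```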

two types:
- cDNA microarrays
- oligonucleotide microarrays

result: relative abundance levels for all mRNAs


--
cDNA microarrays 

cDNA = DNA complementary to mRNA.

How to obtain cDNA:
- synthesize based on sequence data (e.g., from databases)
- from wet library

probes = cDNAs, 500~5,000 bases long

probes are attached to solid surface (e.g., glass) using robot spotting; 

  current technology can generate arrays with > 10k probes / cm^2

chip is exposed to specific mRNA targets or a mixture of targets. 

mRNAs come from two sources (cell types / states): 
- test
- reference
distinguished using fluorescent tags (green/red)

fluorescently tagged mRNA pools from both test and reference are mixed
and washed over the surface of the array
-> competitive hybridisation (relative amount captured is 
	proportional to relative abundance in mixture)
determine relative amount of transcript present in the pool 
	for each of the two cell types

optically scan image of array using microarray scanner
	(based on laser excitation, white light + filters, etc.)

analyse image to obtain (relative) gene expression levels,
	signals and background can be separated and quantified.
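
Sketch of one simple way to separate signal from background for a single
spot: take the median intensity of the pixels inside the spot and subtract
the median of the pixels around it. This is an assumed, simplified scheme
(spot finding and pixel segmentation are taken as given); spot_signal and
the pixel values are made up for illustration.

```python
from statistics import median

def spot_signal(foreground, background):
    """Background-corrected spot intensity, floored at 0."""
    return max(median(foreground) - median(background), 0)

fg = [1200, 1350, 1280, 1310]  # pixels inside the spot (made-up values)
bg = [100, 120, 90, 110]       # pixels surrounding the spot (made-up values)
signal = spot_signal(fg, bg)
print(signal)  # 1190.0
```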

brightness of spot is positively correlated with amount of target captured
(here: test labelled green, reference labelled red)
-> bright green = high in test, low in reference
	bright red = low in test, high in reference

sources of error:
- differences in amount of nucleic acid in target preparation
- variations of labelling efficiency
- variations in efficiency of hybridisation
- variations in washing efficiency (e.g., due to mishybridisations)

how to deal with these?
- consider ratio of intensities for test and reference instead
	of absolute values (reduces error due to 
	sequence-specific variations in hybridisation efficiency, etc.)
- calibration techniques
	(e.g., use calibration spots to capture identical samples
	labelled red and green)
- various normalisation techniques 
	(e.g., normalisation for global variation across entire array)
- confirmation of specific results with other methods (e.g., Northern blotting)
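
Sketch combining two of the points above: per-spot log ratios of test vs
reference intensity, followed by a simple global normalisation that shifts
all log ratios so that their median over the array is 0. This assumes
positive (background-corrected) intensities and that most genes are not
differentially expressed; the function name and the numbers are made up,
and other normalisation schemes exist.

```python
from math import log2
from statistics import median

def normalised_log_ratios(test, reference):
    """log2(test/reference) per spot, shifted so the array-wide median is 0."""
    ratios = [log2(t / r) for t, r in zip(test, reference)]
    centre = median(ratios)  # global offset, e.g. from unequal labelling efficiency
    return [x - centre for x in ratios]

test = [400.0, 800.0, 1600.0]      # made-up intensities for three spots
reference = [200.0, 200.0, 200.0]
m = normalised_log_ratios(test, reference)
print(m)  # [-1.0, 0.0, 1.0]
```

After centring, a value of +1 means the spot is twice as strong in test
relative to the typical spot on this array, and -1 means half as strong.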

Result: relative abundance levels for target mRNAs corresponding to all
	probes on array (vector of real numbers)


--
oligonucleotide microarrays

probes = 20~80-mer oligonucleotides or peptide nucleic acid (PNA) 

probes are synthesized either in situ (on-chip) 
	or by conventional synthesis followed by on-chip immobilization

high densities can be achieved using photolithographic fabrication techniques
	(similar to silicon chips)
	-> 40-160k spots / cm^2; ~200 microns/spot

other approaches to fabricating oligonucleotide microarrays
exist, e.g., in situ synthesis or deposition technologies.


[see http://www.gene-chips.com/]

-> Affymetrix GeneChips(TM)

array is basically used in the same way as a cDNA array,
except that usually only one sample (instead of test + reference)
is hybridised to a single array.


---
Resources:

* http://latin.arizona.edu/~dgalbrai/pls439/deyholos1.pdf
http://www.gene-chips.com/
http://www.icsi.berkeley.edu/~epxing/lecture23.pdf

R. Ekins and F.W. Chu. Microarrays: their origins and applications. 
  Trends in Biotechnology, 1999, 17, pp.217-218.

http://www.tulane.edu/~biochem/lecture/723/est.html
-> EST analysis

---